Loss Bounds for Uncertain Transition Probabilities in Markov Decision Processes
Abstract
We analyze losses resulting from uncertain transition probabilities in Markov decision processes with bounded nonnegative rewards. We assume that policies are pre-computed using exact dynamic programming with the estimated transition probabilities, but the system evolves according to different, true transition probabilities. Our approach analyzes the growth of errors incurred by stepping backwards in time while precomputing value functions, which requires bounding a multilinear program. We present loss bounds for the finite horizon undiscounted, finite horizon discounted, and infinite horizon discounted cases. Loss bounds are given in terms of the maximum total variation error of all transition probabilities, maximum reward, number of stages, and discount factor. A tight example is given for the finite horizon undiscounted case.
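The setting in the abstract can be illustrated with a small numerical sketch: a policy is precomputed by backward induction on estimated transition probabilities, then evaluated under the true ones, and the loss is the gap to the truly optimal value. Everything below is an assumption made for illustration and is not taken from the paper: the two-state, two-action model, the reward table, the function names, and the convention that total variation is half the L1 distance between rows. The sketch covers only the finite horizon undiscounted case and does not reproduce the paper's bounds.

```python
def backward_induction(P, R, T):
    """Finite-horizon undiscounted DP.

    P[a][s][s2] is the probability of moving from s to s2 under action a,
    R[s][a] is the stage reward. Returns (policy, values), where
    policy[t][s] is the action chosen in state s at stage t.
    """
    n_states, n_actions = len(R), len(P)
    V = [0.0] * n_states  # terminal value is zero
    policy = []
    for _ in range(T):  # step backwards in time
        new_V, decision = [], []
        for s in range(n_states):
            q = [R[s][a] + sum(P[a][s][s2] * V[s2] for s2 in range(n_states))
                 for a in range(n_actions)]
            best = max(range(n_actions), key=lambda a: q[a])
            decision.append(best)
            new_V.append(q[best])
        V = new_V
        policy.append(decision)
    policy.reverse()  # stages were built last-to-first
    return policy, V


def evaluate(policy, P, R):
    """Value of a fixed time-dependent policy under transition model P."""
    n_states = len(R)
    V = [0.0] * n_states
    for decision in reversed(policy):
        V = [R[s][decision[s]]
             + sum(P[decision[s]][s][s2] * V[s2] for s2 in range(n_states))
             for s in range(n_states)]
    return V


def max_total_variation(P_true, P_est):
    """Largest total variation distance over all (action, state) rows.

    Assumes the 1/2 * L1 convention for total variation.
    """
    return max(0.5 * sum(abs(p - q) for p, q in zip(row_t, row_e))
               for A_t, A_e in zip(P_true, P_est)
               for row_t, row_e in zip(A_t, A_e))


# Hypothetical model: rewards are nonnegative and bounded by 1.
R = [[1.0, 0.0], [0.0, 0.5]]
P_true = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.6, 0.4]]]
P_est = [[[0.7, 0.3], [0.2, 0.8]], [[0.5, 0.5], [0.4, 0.6]]]
T = 5  # number of stages

est_policy, _ = backward_induction(P_est, R, T)    # what the planner computes
opt_policy, v_opt = backward_induction(P_true, R, T)  # best achievable values
v_est_policy = evaluate(est_policy, P_true, R)     # what actually happens
loss = [vo - ve for vo, ve in zip(v_opt, v_est_policy)]

print("max total variation error:", max_total_variation(P_true, P_est))
print("loss per initial state:", loss)
```

The loss is nonnegative for every initial state, because the policy computed from the true probabilities is optimal under them. The quantities the abstract lists as inputs to the bounds (maximum total variation error, maximum reward, number of stages) all appear here as explicit values.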
Similar resources
Model-Checking Markov Chains in the Presence of Uncertainties
We investigate the problem of model checking Interval-valued Discrete-time Markov Chains (IDTMC). IDTMCs are discrete-time finite Markov Chains for which the exact transition probabilities are not known. Instead, in IDTMCs, each transition is associated with an interval in which the actual transition probability must lie. We consider two semantic interpretations for the uncertainty in the transi...
Robust Control of Markov Decision Processes with Uncertain Transition Matrices
Optimal solutions to Markov decision problems may be very sensitive with respect to the state transition probabilities. In many practical problems, the estimation of these probabilities is far from accurate. Hence, estimation errors are limiting factors in applying Markov decision processes to real-world problems. We consider a robust control problem for a finite-state, finite-action Markov dec...
Robust Markov Decision Processes with Uncertain Transition Matrices
Optimal solutions to Markov decision problems may be very sensitive with respect to the state transition probabilities. In many practical problems, the estimation of these probabilities is far from accurate. Hence, estimation errors are limiting factors in applying Markov decision processes to real-world problems. We consider a robust control problem for a finite-state, finite-action Markov dec...
Diffusion Approximation for Bayesian Markov Chains
Given a Markov chain with uncertain transition probabilities modelled in a Bayesian way, we investigate a technique for analytically approximating the mean transition frequency counts over a finite horizon. Conventional techniques for addressing this problem either require the enumeration of a set of generalized process "hyperstates" whose cardinality grows exponentially with the terminal horiz...
Analysis of approximation and uncertainty in optimization
We study a series of topics involving approximation algorithms and the presence of uncertain data in optimization. On the first theme of approximation, we derive performance bounds for rollout algorithms. Interpreted as an approximate dynamic programming algorithm, a rollout algorithm estimates the value-to-go at each decision stage by simulating future events while following a heuristic policy...